Non contiguous-finished genome sequence and description of Peptoniphilus obesi sp. nov.

Peptoniphilus obesi strain ph1T sp. nov., is the type strain of P. obesi sp. nov., a new species within the genus Peptoniphilus. This strain, whose genome is described here, was isolated from the fecal flora of a 26-year-old woman suffering from morbid obesity. P. obesi strain ph1T is a Gram-positive, obligate anaerobic coccus. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 1,774,150 bp long genome (1 chromosome but no plasmid) contains 1,689 protein-coding and 29 RNA genes, including 5 rRNA genes.


Introduction
Peptoniphilus obesi strain ph1T (= CSUR P187 = DSM 25489) is the type strain of P. obesi sp. nov. This bacterium is a Gram-positive, anaerobic, indole-negative coccus that was isolated from the stool of a 26-year-old woman suffering from morbid obesity, as part of a study aiming to cultivate individually all species within human feces [1]. Widespread use of gene sequencing, notably of the 16S rRNA gene, for the identification of bacteria recovered from clinical specimens has enabled the description of a great number of bacterial species and genera of clinical importance [2,3]. The recent development of high-throughput genome sequencing and mass spectrometric analyses has provided unprecedented access to a wealth of genetic and proteomic information [4]. The current classification of prokaryotes, known as polyphasic taxonomy, relies on a combination of phenotypic and genotypic characteristics [5]. However, as more than 3,000 bacterial genomes have been sequenced [6] and the cost of genomic sequencing is decreasing, we recently proposed to integrate genomic information, in addition to the main phenotypic characteristics (habitat, Gram-stain reaction, culture and metabolic characteristics, and when applicable, pathogenicity), in the description of new bacterial species [7][8][9][10][11][12][13][14][15][16][17][18]. The commensal microbiota of humans and animals consists, in part, of many Gram-positive anaerobic cocci. These bacteria are also commonly associated with a variety of human infections [19].
Extensive taxonomic changes have occurred among this group of bacteria, especially in clinically important genera such as Finegoldia, Parvimonas, and Peptostreptococcus [20]. Members of the genus Peptostreptococcus were divided into three new genera, Peptoniphilus, Anaerococcus and Gallicola, by Ezaki [20]. The genus Peptoniphilus currently contains eight species that produce butyrate, are non-saccharolytic and use peptone and amino acids as major energy sources: P. asaccharolyticus, P. harei, P. indolicus, P. ivorii, P. lacrimalis [20], P. gorbachii, P. olsenii, and P. methioninivorax [21,22]. Members of the genus Peptoniphilus have been isolated mainly from various human clinical specimens such as vaginal discharges and ovarian, peritoneal, sacral and lachrymal gland abscesses [23]. In addition, P. indolicus causes summer mastitis in cattle [23]. Here we present a summary classification and a set of features for P. obesi sp. nov. strain ph1T (= CSUR P187 = DSM 25489), together with the description of the complete genomic sequence and its annotation. These characteristics support the circumscription of the species P. obesi.

Classification and features
A stool sample was collected from a 26-year-old woman living in Marseille (France), who suffered from morbid obesity: BMI = 48.2 (118.8 kg, 1.57 m). At the time of stool sample collection, she was not a drug user and was not on a diet.
Standards in Genomic Sciences
The patient gave informed and signed consent, and the agreement of the local ethics committee of the IFR48 (Marseille, France) was obtained under agreement 09-022. The fecal specimen was preserved at -80°C after collection. Strain ph1T (Table 1) was isolated in 2011 by anaerobic cultivation on 5% sheep blood-enriched Columbia agar (BioMérieux, Marcy l'Etoile, France) after 26 days of preincubation of the stool sample in an anaerobic blood culture bottle enriched with sterile blood and rumen fluid. This strain exhibited a 91.0% 16S rRNA nucleotide sequence similarity with P. asaccharolyticus and P. indolicus, the phylogenetically closest validated Peptoniphilus species (Figure 1). Among the validly published Peptoniphilus species, the percentage of 16S rRNA sequence similarity ranges from 86.0% (P. ivorii vs. P. olsenii) to 98.5% (P. asaccharolyticus vs. P. indolicus). Although strain ph1T exhibited a 16S rRNA sequence similarity lower than the 95.0% cutoff, which is usually regarded as a threshold for the creation of a new genus [2], we considered it a new species within the genus Peptoniphilus.
Table 1 footnote (fragment): [...] not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [20]. If the evidence is IDA, then the property was directly observed for a live isolate by one of the authors or an expert mentioned in the acknowledgements.
Figure 1 caption (fragment): [...] nov., a new species that we recently proposed, was also included in the analysis [12]. Anaerococcus prevotii was used as outgroup. The scale bar represents a 2% nucleotide sequence divergence.
Different growth temperatures (25, 30, 37, 45°C) were tested. Growth was observed between 30°C and 45°C, with optimal growth at 37°C. Colonies were gray, transparent, opaque, non-bright and 0.4 mm in diameter on blood-enriched Columbia agar. Growth of the strain was tested under anaerobic and microaerophilic conditions using GENbag anaer and GENbag microaer systems, respectively (BioMérieux), and in the presence of air, with or without 5% CO2. Optimal growth was achieved anaerobically, and no growth occurred under microaerophilic or aerobic conditions. A motility test was negative. Cells grown on agar are Gram-positive (Figure 2), and their diameter ranged from 0.77 µm to 0.93 µm, with a mean diameter of 0.87 µm, as measured by electron microscopy (Figure 3). Strain ph1T exhibited neither catalase nor oxidase activity. Using the API rapid ID 32A system (BioMérieux), positive reactions were observed for arginine arylamidase and leucine arylamidase. Negative reactions were found for urease, nitrate reduction, arginine dihydrolase, indole production, α-arabinosidase, α-glucosidase, α-fucosidase, β-galactosidase, glutamic acid decarboxylase, 6-phospho-β-galactosidase, β-glucosidase, β-glucuronidase, N-acetyl-β-glucosaminidase, D-mannose, D-raffinose, alkaline phosphatase, alanine arylamidase, glutamyl glutamic acid arylamidase, glycine arylamidase, histidine arylamidase, leucyl glycine arylamidase, phenylalanine arylamidase, proline arylamidase, pyroglutamic acid arylamidase, serine arylamidase and tyrosine arylamidase. P. obesi is susceptible to penicillin G, amoxicillin, amoxicillin + clavulanic acid, imipenem, nitrofurantoin, erythromycin, doxycycline, rifampicin, vancomycin, gentamicin 500 and metronidazole, and resistant to ceftriaxone, ciprofloxacin, gentamicin 10 and trimethoprim + sulfamethoxazole. When compared with Peptoniphilus grossensis strain ph5T, P. obesi sp. nov. strain ph1T exhibited the following phenotypic differences: no endospore formation, no indole production, no tyrosine arylamidase or histidine arylamidase production, and no fermentation of D-mannose. P. obesi sp. nov. strain ph1T differed from Peptoniphilus timonensis strain JC401T by endospore formation and by catalase, indole, α-galactosidase, leucine arylamidase, tyrosine arylamidase, histidine arylamidase and serine arylamidase production. P. obesi sp. nov. strain ph1T differed from Peptoniphilus gorbachii strain WAL 10418T by glutamyl glutamic acid arylamidase, phenylalanine arylamidase, tyrosine arylamidase and glycine arylamidase production (Table 2).
Matrix-assisted laser-desorption/ionization time-of-flight (MALDI-TOF) MS protein analysis was carried out as previously described [34]. Briefly, a pipette tip was used to pick one isolated bacterial colony from a culture agar plate and to spread it as a thin film on an MTP 384 MALDI-TOF target plate (Bruker Daltonics, Leipzig, Germany). Twelve distinct deposits were made for strain ph1T from twelve isolated colonies. Each smear was overlaid with 2 µL of matrix solution (saturated solution of alpha-cyano-4-hydroxycinnamic acid) in 50% acetonitrile, 2.5% trifluoroacetic acid, and allowed to dry for five minutes. Measurements were performed with a Microflex spectrometer (Bruker). Spectra were recorded in the positive linear mode for the mass range of 2,000 to 20,000 Da (parameter settings: ion source 1 (IS1), 20 kV; IS2, 18.5 kV; lens, 7 kV). A spectrum was obtained after 675 shots at a variable laser power. The time of acquisition was between 30 seconds and 1 minute per spot. The twelve ph1T spectra were imported into the MALDI BioTyper software (version 2.0, Bruker) and analyzed by standard pattern matching (with default parameter settings) against the main spectra of 3,769 bacteria, including spectra from 8 of the 11 validly published species of Peptoniphilus, which are part of the reference data contained in the BioTyper database.
The identification method covered the m/z range from 2,000 to 20,000 Da. For every spectrum, at most 100 peaks were taken into account and compared with spectra in the database. A score determined whether the tested isolate could be identified: a score > 2 with a validly published species enabled identification at the species level; a score > 1.7 but < 2 enabled identification at the genus level; and a score < 1.7 did not enable any identification. For strain ph1T, the maximal score obtained was 1.25, suggesting that our isolate was not a member of a known species. We added the spectrum from strain ph1T to our database for future reference (Figure 4). Finally, the gel view highlights the spectral differences with other members of the genus Peptoniphilus (Figure 5).
Figure 5 caption: The gel view displays the raw spectra of all loaded spectrum files arranged in a pseudo-gel-like look. The x-axis records the m/z value. The left y-axis displays the running spectrum number originating from subsequent spectra loading. The peak intensity is expressed by a gray-scale scheme code. The color bar and the right y-axis indicate the relation between the color a peak is displayed with and the peak intensity in arbitrary units.
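The score interpretation described above can be illustrated with a minimal Python sketch. The function name is hypothetical and is not part of the BioTyper software. The text does not say how scores of exactly 2 or exactly 1.7 are handled, so the sketch assumes that such boundary values fall to the lower identification level.

```python
def interpret_score(score: float) -> str:
    """Map a pattern-matching score to an identification level.

    Thresholds follow the text: > 2 species level, > 1.7 genus level,
    otherwise no identification. Handling of exactly 2.0 or exactly 1.7
    is an assumption (they fall to the lower level here).
    """
    if score > 2.0:
        return "species"
    if score > 1.7:
        return "genus"
    return "none"


# Maximal score reported for strain ph1T in the text.
print(interpret_score(1.25))
```

With the maximal score of 1.25 reported for strain ph1T, the sketch returns "none", in line with the conclusion that the isolate did not match a known species in the database.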

Genome sequencing and annotation

Genome project history
The organism was selected for sequencing on the basis of its phylogenetic position and 16S rRNA similarity to other members of the genus Peptoniphilus, and is part of a study of the human digestive flora aiming at isolating all bacterial species within human feces. It was the seventh genome of a Peptoniphilus species and the first genome of P. obesi sp. nov. The GenBank accession number is CAHB00000000; the assembly consists of 32 contigs arranged in 5 scaffolds. Table 3 shows the project information and its association with MIGS version 2.0 compliance.

Genome annotation
Open reading frames (ORFs) were predicted using Prodigal [35] with default parameters, but predicted ORFs were excluded if they spanned a sequencing gap region. The predicted bacterial protein sequences were searched against the GenBank database [36] and the Clusters of Orthologous Groups (COG) databases using BLASTP. The tRNAScan-SE tool [37] was used to find tRNA genes, whereas ribosomal RNAs were found by using RNAmmer [38] and BLASTN against the GenBank database. Signal peptides and numbers of transmembrane helices were predicted using SignalP [39] and TMHMM [40], respectively. ORFans were identified if their BLASTP E-value was lower than 1e-03 for alignment lengths greater than 80 amino acids. If alignment lengths were smaller than 80 amino acids, we used an E-value of 1e-05. To estimate the mean level of nucleotide sequence similarity at the genome level between Peptoniphilus obesi and other members of the genus Peptoniphilus, we compared genomes two by two and determined the mean percentage of nucleotide sequence identity among orthologous ORFs using BLASTN. Orthologous genes were detected using the Proteinortho software [41].
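The length-dependent E-value cutoff used in the ORFan step can be sketched as follows. This is an illustration only, not the authors' pipeline: the function names are hypothetical, and because the text does not cover an alignment length of exactly 80 amino acids, the sketch assumes the stricter cutoff (1e-05) applies in that case.

```python
def evalue_cutoff(alignment_length: int) -> float:
    """Return the BLASTP E-value cutoff for a given alignment length.

    Per the text: 1e-03 for alignments longer than 80 amino acids,
    1e-05 for shorter ones. Exactly 80 is not specified in the text;
    using the stricter cutoff there is an assumption.
    """
    return 1e-3 if alignment_length > 80 else 1e-5


def below_cutoff(evalue: float, alignment_length: int) -> bool:
    """Check whether a BLASTP E-value is lower than the applicable cutoff."""
    return evalue < evalue_cutoff(alignment_length)


# Hypothetical hits given as (E-value, alignment length in amino acids).
for evalue, length in [(1e-4, 120), (1e-4, 60), (1e-7, 60)]:
    print(evalue, length, below_cutoff(evalue, length))
```

The same E-value of 1e-4 passes the cutoff for a 120-residue alignment but not for a 60-residue one, which shows why the shorter alignments are held to the stricter value.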

Genome properties
The genome is 1,774,150 bp long (1 chromosome, but no plasmid) with a 30.10% G+C content (Table 4 and Figure 6). Of the 1,718 predicted genes, 1,689 were protein-coding genes and 29 were RNAs. A total of 1,278 genes (74.39%) were assigned a putative function. ORFans represented 4.9% (84 genes) of the predicted genes. The remaining genes were annotated as hypothetical proteins. The distribution of genes into COG functional categories is presented in Table 5 and Figure 6. The properties and the statistics of the genome are summarized in Tables 4 and 5. Table 6 summarizes the numbers of orthologous genes and the average percentage of nucleotide sequence identity between the different genomes studied.
Table 4 (fragment): Genes with transmembrane helices: 414 (24.10% of total). Footnote a: The total is based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome.
Table footnote (fragment): The total is based on the total number of protein-coding genes in the annotated genome.
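As a quick consistency check on the figures above, the following Python sketch recomputes the reported percentages. The helper name is hypothetical. It assumes the denominator is the 1,718 predicted genes, which is the value that reproduces the reported 74.39% and 24.10%.

```python
TOTAL_PREDICTED_GENES = 1718  # 1,689 protein-coding genes + 29 RNA genes


def percent_of_total(count: int, total: int = TOTAL_PREDICTED_GENES) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


# Gene counts taken from the text and the Table 4 fragment.
counts = {
    "assigned a putative function": 1278,
    "ORFans": 84,
    "with transmembrane helices": 414,
}

for label, count in counts.items():
    print(f"Genes {label}: {count} ({percent_of_total(count)}%)")
```

The ORFan share comes out at 4.89%, which agrees with the 4.9% given in the text after rounding to one decimal place.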

Conclusion
On the basis of phenotypic (Table 2), phylogenetic and genomic analyses (Table 6), we formally propose the creation of Peptoniphilus obesi sp. nov., which contains strain ph1T. This strain was isolated in Marseille, France.